- An issue for molecular dynamics simulations is that events of interest often involve timescales that are much longer than the simulation time step, which is set by the fastest timescales of the model. Because of this timescale separation, direct simulation of many events is prohibitively computationally costly. This issue can be overcome by aggregating information from many relatively short simulations that sample segments of trajectories involving events of interest. This is the strategy of Markov state models (MSMs) and related approaches, but such methods suffer from approximation error because the variables defining the states generally do not capture the dynamics fully. By contrast, once converged, the weighted ensemble (WE) method aggregates information from trajectory segments so as to yield unbiased estimates of both thermodynamic and kinetic statistics. Unfortunately, errors decay no faster than unbiased simulation in WE as originally formulated and commonly deployed. Here, we introduce a theoretical framework for describing WE that shows that the introduction of an approximate stationary distribution on top of the stratification, as in nonequilibrium umbrella sampling (NEUS), accelerates convergence. Building on ideas from MSMs and related methods, we generalize the NEUS approach in such a way that the approximation error can be reduced systematically. We show that the improved algorithm can decrease the simulation time required to achieve the desired precision by orders of magnitude.
- Mercury’s orbit can destabilize, generally resulting in a collision with either Venus or the Sun. Chaotic evolution can cause g1 to decrease to the approximately constant value of g5 and create a resonance. Previous work has approximated the variation in g1 as stochastic diffusion, which leads to a phenomenological model that can reproduce the Mercury instability statistics of secular and N-body models on timescales longer than 10 Gyr. Here we show that the diffusive model significantly underpredicts the Mercury instability probability on timescales less than 5 Gyr, the remaining lifespan of the solar system. This is because g1 exhibits larger variations on short timescales than the diffusive model would suggest. To better model the variations on short timescales, we build a new subdiffusive phenomenological model for g1. Subdiffusion is similar to diffusion but exhibits larger displacements on short timescales and smaller displacements on long timescales. We choose model parameters based on the behavior of the g1 trajectories in the N-body simulations, leading to a tuned model that can reproduce Mercury instability statistics from 1 to 40 Gyr. This work motivates fundamental questions in solar system dynamics: why does subdiffusion better approximate the variation in g1 than standard diffusion? Why is there an upper bound on g1, but not a lower bound that would prevent it from reaching g5?
- Many chemical reactions and molecular processes occur on time scales that are significantly longer than those accessible by direct simulations. One successful approach to estimating dynamical statistics for such processes is to use many short time series of observations of the system to construct a Markov state model, which approximates the dynamics of the system as memoryless transitions between a set of discrete states. The dynamical Galerkin approximation (DGA) is a closely related framework for estimating dynamical statistics, such as committors and mean first passage times, by approximating solutions to their equations with a projection onto a basis. Because the projected dynamics are generally not memoryless, the Markov approximation can result in significant systematic errors. Inspired by quasi-Markov state models, which employ the generalized master equation to encode memory resulting from the projection, we reformulate DGA to account for memory and analyze its performance on two systems: a two-dimensional triple well and the AIB9 peptide. We demonstrate that our method is robust to the choice of basis and can decrease the time series length required to obtain accurate kinetics by an order of magnitude.
- Understanding dynamics in complex systems is challenging because there are many degrees of freedom, and those that are most important for describing events of interest are often not obvious. The leading eigenfunctions of the transition operator are useful for visualization, and they can provide an efficient basis for computing statistics, such as the likelihood and average time of events (predictions). Here, we develop inexact iterative linear algebra methods for computing these eigenfunctions (spectral estimation) and making predictions from a dataset of short trajectories sampled at finite intervals. We demonstrate the methods on a low-dimensional model that facilitates visualization and a high-dimensional model of a biomolecular system. Implications for the prediction problem in reinforcement learning are discussed.
- Blocking events are an important cause of extreme weather, especially long‐lasting blocking events that trap weather systems in place. The duration of blocking events is, however, underestimated in climate models. Explainable artificial intelligence comprises a class of data analysis methods that can help identify physical causes of prolonged blocking events and diagnose model deficiencies. We demonstrate this approach on an idealized quasigeostrophic (QG) model developed by Marshall and Molteni (1993), https://doi.org/10.1175/1520‐0469(1993)050<1792:taduop>2.0.co;2. We train a convolutional neural network (CNN) and subsequently build a sparse predictive model for the persistence of Atlantic blocking, conditioned on an initial high‐pressure anomaly. Shapley Additive ExPlanation (SHAP) analysis reveals that high‐pressure anomalies in the American Southeast and North Atlantic, separated by a trough over Atlantic Canada, contribute significantly to the prediction of sustained blocking events in the Atlantic region. This agrees with previous work that identified precursors in the same regions via wave train analysis. When we apply the same CNN to blockings in the ERA5 atmospheric reanalysis, there is insufficient data to accurately predict persistent blocks. We partially overcome this limitation by pre‐training the CNN on the plentiful data of the Marshall‐Molteni model and then using transfer learning (TL) to achieve better predictions than direct training. SHAP analysis before and after TL allows a comparison between the predictive features in the reanalysis and the QG model, quantifying dynamical biases in the idealized model. This work demonstrates the potential for machine learning methods to extract meaningful precursors of extreme weather events and achieve better prediction using limited observational data.
- We show how to obtain improved active learning methods in the agnostic (adversarial noise) setting by combining marginal leverage score sampling with non-independent sampling strategies that promote spatial coverage. In particular, we propose an easily implemented method based on the pivotal sampling algorithm, which we test on problems motivated by learning-based methods for parametric PDEs and uncertainty quantification. In comparison to independent sampling, our method reduces the number of samples needed to reach a given target accuracy by up to 50%. We support our findings with two theoretical results. First, we show that any non-independent leverage score sampling method that obeys a weak one-sided ℓ∞ independence condition (which includes pivotal sampling) can actively learn d-dimensional linear functions with O(d log d) samples, matching independent sampling. This result extends recent work on matrix Chernoff bounds under ℓ∞ independence, and may be of interest for analyzing other sampling strategies beyond pivotal sampling. Second, we show that, for the important case of polynomial regression, our pivotal method obtains an improved bound of O(d) samples.
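As a rough illustration of the pivotal sampling idea named in the last abstract above, here is a minimal sketch of the generic sequential pivotal method from survey sampling. It is not taken from the paper: the function name `pivotal_sample`, the tolerance `tol`, and the plain left-to-right ordering of units are assumptions made for this example, and the authors' method may order or pair units differently to promote spatial coverage. Each unit keeps its marginal inclusion probability, but selections of neighboring units are negatively correlated.

```python
import random


def pivotal_sample(probs, rng=None, tol=1e-9):
    """Sequential pivotal sampling (illustrative sketch, not the paper's code).

    probs: marginal inclusion probabilities, one per unit.
    Returns the sorted indices of the selected units. If the probabilities
    sum to an integer k, exactly k units are selected (up to `tol`).
    """
    rng = rng or random.Random()
    p = list(probs)
    selected = []
    i = None  # index of the unit whose probability is still fractional

    for j in range(len(p)):
        if p[j] <= tol:          # probability ~0: never selected
            continue
        if p[j] >= 1 - tol:      # probability ~1: always selected
            selected.append(j)
            continue
        if i is None:            # nothing to duel against yet
            i = j
            continue

        a, b = p[i], p[j]
        if a + b < 1:
            # One unit is eliminated; the other carries probability a + b.
            if rng.random() < a / (a + b):
                p[i], p[j] = a + b, 0.0
            else:
                p[i], p[j] = 0.0, a + b
        else:
            # One unit is selected; the other carries probability a + b - 1.
            if rng.random() < (1 - b) / (2 - a - b):
                p[i], p[j] = 1.0, a + b - 1
            else:
                p[i], p[j] = a + b - 1, 1.0

        # Finalize whichever unit was resolved; keep the fractional survivor.
        survivor = None
        for k in (i, j):
            if p[k] >= 1 - tol:
                selected.append(k)
            elif p[k] > tol:
                survivor = k
        i = survivor

    # Leftover fractional unit (only when the probabilities do not sum to an integer).
    if i is not None and rng.random() < p[i]:
        selected.append(i)

    return sorted(selected)
```

In a leverage-score setting, `probs` would be the rescaled leverage scores (capped at 1), so that the expected sample size equals their sum; that rescaling step is left out of this sketch.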
 An official website of the United States government